-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Space-precedence does not apply to value-level operators #10597
Conversation
@@ -841,13 +851,13 @@ fn operator_sections() { | |||
#[test] | |||
fn template_functions() { | |||
#[rustfmt::skip] | |||
test("_.map (_+2 * 3) _*7", block![ | |||
test("_.map (_ + 2*3) _*7", block![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantics of this test have changed, because the original syntax can no longer be expressed with template functions.
06a0360
to
35f3e1a
Compare
fd48f03
to
491518a
Compare
491518a
to
37790bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'd rather see linting integrated into
enso
CLI interface - I don't see any way warnings are propagated to CLI however
- when at the spacing issues, don't you want to fix Expressions not being run if indented by 1 more space than current indentation level #9419?
@@ -44,7 +44,7 @@ class SimpleArithmeticTest extends InterpreterTest { | |||
val code = | |||
"""import Standard.Base.Data.Numbers | |||
|main = | |||
| expr = (10 * 3+2) - 1 | |||
| expr = 10 * (3+2) - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a single fix in runtime-integration-tests
? Good.
@@ -640,15 +640,15 @@ spec_write suite_builder suffix test_sheet_name = | |||
parent = enso_project.data / "transient" / "nonexistent" | |||
parent.exists.should_be_false | |||
|
|||
f = parent / "foo."+suffix | |||
f = parent / ("foo."+suffix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is unfortunate. It has nothing to do with arithmetic and looks really bad surrounded by (
and )
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually we will have string interpolation, right? The parser supports it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please - lets agree how we want to do that :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually we will have string interpolation, right? The parser supports it.
Interesting. I didn't know we are supposed to support string interpolation. Where's the spec? What's the syntax? What Tree
will one get?
Both main x = 'x=$x'
and main x = "x=$x" resulted in a simple text literal on the IR level...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/enso-org/design/blob/wip/wd/enso-spec/epics/enso-spec-1.0/04.%20Expressions.md#interpolated-text-literals
The elements returned by getElements
can return a TextElement.Splice
, but we flatten all of the elements into a string in TreeToIr to fit the current IR representation.
We also flatten in the GUI. When the backend supports splices, in the frontend we would probably want to display the spliced expression with widgets, but we would need to be careful that it is clear that it is part of the string expression.
@@ -50,6 +50,7 @@ exports[`Parsing '2 | |||
"spanLeftOffsetCodeStartLine": 1, | |||
"spanLeftOffsetCodeStartUtf8": 4, | |||
"type": "Number", | |||
"warnings": [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I read this correctly that warnings
are propagated to TypeScript. How are the warnings propagated to CLI? I don't see any changes in Tree
. How will the CLI report these warnings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is Tree.getWarnings
, however they are serialized as warning IDs (plus warning-specific details), with a separate table providing the string/template to display each warning ID. I have not yet implemented the JNI to turn this in to a getMessage
for the Warning
object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to expose the linter via enso-engine*/bin/enso
we will need a way to access the warnings from JVM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, expect a PR soon. I just wanted to get this part merged since it touches a lot of files.
const ENSO_EXTENSION: &str = ".enso"; | ||
|
||
pub fn cargo_run_linter_cmd(repo_root: &Path) -> Result<Command> { | ||
let mut ret = Cargo.cmd()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While having such a linter may satisfy our current development needs, I believe the right linter shall be integrated into the enso
command.
There is enso --compile
that is being invoked as part of CI already. We don't need a new CI task (in my opinion), but rather we should make sure that enso --compile
(maybe with --lint
) fails the CI when the formatting isn't correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's important for the compiler to be able to emit syntax warnings. I think it's also useful to have a simple standalone linter--when linting fails in CI, it's nice if it happens right away, not after building the engine. It isn't a new CI task; it's part of ./run lint
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's also useful to have a simple standalone linter
Indeed. However such a linter has to be available to Enso users, not just to us. E.g. it cannot be part of ./run lint
. It has to be part of enso-engine*/bin/enso
.
How will we expose the linter to Enso end users? What needs to be changed to make that happen? Can you report an issue describing what needs to be done? Can you implement it (and remove current ./run lint
temporary solution) or shall JVM guys in engine team do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standalone linter is only for us, to support our CI process. It can provide results before the engine is finished building. For that purpose, it cannot be part of enso-engine*/bin/enso
. End users will receive warnings from enso-engine*/bin/enso
.
We'll do that too. The separate linter came for free (I needed to write it to test the warnings and update the libs), and was needed for CI.
Line indentation and spacing between subexpressions are handled in different parts of the parser (block-building happens long before precedence-resolution). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving the libs changes.
fn main() { | ||
let args = std::env::args().skip(1); | ||
let parser = enso_parser::Parser::new(); | ||
let parse_time: std::time::Duration = args | ||
.map(|path| { | ||
let code = std::fs::read_to_string(path).unwrap(); | ||
let mut code = code.as_str(); | ||
if let Some((_meta, code_)) = enso_parser::metadata::parse(code) { | ||
code = code_; | ||
} | ||
let start = std::time::Instant::now(); | ||
std::hint::black_box(parser.run(code)); | ||
start.elapsed() | ||
}) | ||
.sum(); | ||
println!("Parse time: {} ms", parse_time.as_millis()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, we could use standard rust benchmark tests, as we still use nightly rust. But it may be done later, of course.
#[derive(Clone, Debug, Eq, PartialEq, Serialize, Reflect, Deserialize)] | ||
#[allow(missing_copy_implementations)] // Future warnings may have attached information. | ||
pub struct Warning { | ||
id: u32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we don't use just WarningId here (And make WarningId #[repr(u32)]
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This representation is designed for FFI. If we used the parser in Rust I would use an enum with parameters, with serialization overridden to produce this representation. Since the Rust API is only used by debug tools, I cut out the Rust-friendly type.
/// Template strings for printing warnings. | ||
// These must be defined in the same order as the [`WarningId`] variants. | ||
pub const WARNINGS: [&str; WarningId::NUM_WARNINGS as usize] = | ||
["Spacing is inconsistent with operator precedence"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a function with a single match? It would not require NUM_WARNINGS
, and is easier to read (no longer need to care about proper order), and - if written properly - will optimize to array of &str anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to send the table across the FFI boundary, and then stringify warnings in the GC'd languages. This avoids FFI calls, encoding conversion, and allocations when emitting warnings. In the future we might move the warning table to a JSON file (which could be passed to some codegen) to further optimize this process and also make the messages easier to edit (or even localize).
pub struct ColdVec<T> { | ||
#[allow(clippy::box_collection)] | ||
#[reflect(as = "Vec<T>")] | ||
elements: Option<Box<Vec<T>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why box?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is how this struct fulfills its purpose. It has a shallow size of 1 pointer, and when unused that is its total size. A Vec
has a shallow size of 3 pointers, even if unused.
In a sequence of value-level operators, whitespace does not affect relative precedence. Functional operators still follow the space-precedence rules. The "functional" operators are: `>> << |> |>> <| <<| : .`, application, and any operator containing `<-` or `->`. All other operators are considered value-level operators. Asymmetric whitespace can still be used to form *operator sections* of value-level operators, e.g. `+2 * 3` is still equivalent to `x -> (x+2) * 3`. Precedence of application is unchanged, so `f x+y` is still equivalent to `f (x + y)` and `f x+y * z` is still equivalent to `(f (x + y)) * z`. Any attempt to use spacing to override value-level operator precedence will be caught by the new enso linter. Mixed spacing (for clarity) in value-operator expressions is allowed, as long as it is consistent with the precedences of the operators. Closes #10366. # Important Notes Precedence warnings: - The parser emits a warning if the whitespace in an expression is inconsistent with its effective precedence. - A new enso linter can be run with `./run libraries lint`. It parses all `.enso` files in `distribution/lib` and `test`, and reports any errors or warnings. It can also be run on individual files: `cargo run --release --bin check_syntax -- file1 file2...` (the result may be easier to read than the `./run` output). - The linter is also run as part of `./run lint`, so it is checked in CI. Additional language change: - The exponentiation operator (`^`) now has higher precedence than the multiplication class (`*`, `/`, `%`). This change did not affect any current enso files. Library changes: - The libraries have been updated. The new warnings were used to identify all affected code; the changes themselves have not been programmatically verified (in many cases their equivalence relies on the commutativity of string concatenation). (cherry picked from commit e5b85bf)
Pull Request Description
In a sequence of value-level operators, whitespace does not affect relative precedence. Functional operators still follow the space-precedence rules.
The "functional" operators are:
>> << |> |>> <| <<| : .
, application, and any operator containing<-
or->
. All other operators are considered value-level operators.Asymmetric whitespace can still be used to form operator sections of value-level operators, e.g.
+2 * 3
is still equivalent tox -> (x+2) * 3
.Precedence of application is unchanged, so
f x+y
is still equivalent tof (x + y)
andf x+y * z
is still equivalent to(f (x + y)) * z
.Any attempt to use spacing to override value-level operator precedence will be caught by the new enso linter. Mixed spacing (for clarity) in value-operator expressions is allowed, as long as it is consistent with the precedences of the operators.
Closes #10366.
Important Notes
Precedence warnings:
./run libraries lint
. It parses all.enso
files indistribution/lib
andtest
, and reports any errors or warnings. It can also be run on individual files:cargo run --release --bin check_syntax -- file1 file2...
(the result may be easier to read than the./run
output)../run lint
, so it is checked in CI.Additional language change:
^
) now has higher precedence than the multiplication class (*
,/
,%
). This change did not affect any current enso files.Library changes:
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.